IN THE CLAIMS: 

Please amend the claims as follows: 

1. (Currently Amended): A method for processing a group of related divergent 
graphics samples in a programmable graphics processing unit having a recirculating 
pipeline implemented as a single instruction multiple data (SIMD) architecture, the 
method comprising: 

configuring each of a plurality of programmable computation units by a field of 
codewords to perform an operation on multiple samples; 

incrementing a subroutine depth of a first sample of the related divergent 
samples to designate that a first call instruction and a first return instruction are 
to be executed on the first sample; 

pushing state data of a second sample of the related divergent samples upon 
which the first call and the first return instructions are not to be executed 
onto a global stack to define the second sample as idle; 

dispatching a token associated with the group of samples into the pipeline along 
with all samples in the group of related divergent samples; 

executing the first call instruction and the first return instruction on the first 
sample, but not the second sample; and 

storing the processed divergent samples for output or display. 

2. (Currently Amended): The method of claim 1, further comprising the step of 
holding the second sample idle and associating a sample depth in a sample depth score 
board with the first sample and each sample of the group of related samples, wherein 
the sample depth represents the number of call/return cycles to be executed on each of 
the samples. 

3. (Currently Amended): The method of claim 2, wherein holding the second 
sample idle comprises encoding the second sample with non-operation information and 
pushing the second sample onto the global stack encoded with information that no 
operations are to be performed on the second sample, removing the second sample 
from the group of divergent samples on which operations are performed. 

4. (Currently Amended): The method of claim 1, further comprising the step of 
determining whether the first call instruction includes a call return 
that contains a second call instruction, and modifying the state data of each of the 
samples to indicate a number of call returns associated with each of the samples. 

5. (Currently Amended): The method of claim 4, further comprising the step of 
incrementing the subroutine depth of the first sample to designate that second call 
instructions are to be executed on the first sample, and documenting the sample depth 
of the second sample and testing the sample depth of all samples of the group. 

6. (Currently Amended): The method of claim 5, further comprising the step of 
executing the second call instructions on the first sample. 

7. (Currently Amended): The method of claim 1, including comparing the state 
data associated with each sample in the group to identify one or more of the samples 
with the greatest subroutine depth, and removing the second samples from [[a]] the 
working set of data which do not have the greatest subroutine depth. 

8. (Currently Amended): The method of claim [[1]] 7, further comprising the 
step of determining whether an instruction in an instruction sequence includes a 
call-return that contains the first call instructions, and dispatching a new token into the 
pipeline with the group of samples after each comparison of the state data. 

9. (Currently Amended): A method of processing a group of related divergent 
graphics samples in a programmable graphics processing unit having a recirculating 
pipeline embodied as a single instruction multiple data (SIMD) architecture, the method 
comprising: 

configuring each of a plurality of programmable computation units by a 
field of codewords to perform an operation on multiple samples of the groups; 

identifying a first sample of the group of related samples having a first 
subroutine depth; 

holding idle a second sample having a second subroutine depth, the first 
subroutine depth being greater than the second subroutine depth; 

dispatching a token associated with the samples of the group into the pipeline 
along with all the group of samples; 

executing operations specified in first return instructions on the first sample; 

comparing the sample depth of all the samples of the groups of related samples; and 

executing an operation specified in the token on samples of the groups of related 
samples having the greatest subroutine depth; and 

storing the processed divergent samples for output or display. 

10. (Original): The method of claim 9, wherein holding idle a second sample 
comprises encoding the second sample with non-operation information. 

11. (Currently Amended): The method of claim [[9]], wherein 
holding the second sample idle comprises encoding the second sample with 
non-operation information and pushing the second sample onto the global stack encoded 
with information that no operations are to be performed on the second sample, and 
removing the second sample from the group of divergent samples on which operations 
are performed. 

12. (Original): The method of claim 9, further comprising the step of decrementing 
the first subroutine depth, making the first subroutine depth equal to the second 
subroutine depth. 

13. (Original): The method of claim 12, further comprising the step of determining 
whether the first subroutine depth is greater than zero. 

14. (Original): The method of claim 13, further comprising the step of identifying in 
an instruction sequence a next instruction to be executed if the first subroutine depth is 
equal to zero. 

15. (Original): The method of claim 13, further comprising the step of executing 
second return instructions on the first sample and the second sample if the first 
subroutine depth is greater than zero. 

16. (Original): The method of claim 15, further comprising the step of 
decrementing the first subroutine depth and the second subroutine depth. 

17. (Currently Amended): A system for processing a group of related divergent 
graphics samples in a programmable graphics processing unit having a recirculating 
pipeline implemented as a single instruction multiple data (SIMD) architecture, the 
system comprising: 

configuring each of a plurality of programmable computation units by a field of 
codewords to perform an operation on multiple samples of the groups; 

a subroutine depth scoreboard configured to store a subroutine depth 
corresponding to each sample of the groups of related samples; 

a global stack configured to store state data related to each sample of the group 
of related samples; and 

a remap configured to compare the subroutine depth of each of the samples of the 
group of related samples and to increment and decrement the subroutine depth 
in the subroutine depth scoreboard and to push state data 
onto and to pop state data from the global stack based on the decision as to which of 
the samples of the group of samples have the greatest subroutine depth. 

18. (Currently Amended): The system of claim 17, wherein the remap is further 
configured to determine that first call instructions are to be executed on the first 
samples selected as having the greatest subroutine depth, but not the second 
samples of the group identified as not having the greatest subroutine depth, to 
increment the first subroutine depth to designate that the first call instructions are to be 
executed on the first samples, and to push state data of the second 
samples onto the global stack. 

19. (Currently Amended): The system of claim 18, wherein the remap is further 
configured to encode the second samples with non-operation data, to generate 
a PC token for executing the first call instructions, the PC token containing one or more 
codewords that configure programmable computation units within a recirculating shader 
pipeline to execute the first call instructions, and to dispatch the PC token into the 
recirculating shader pipeline, followed by the first samples and the second 
samples. 

20. (Currently Amended): The system of claim 17, wherein the remap is further 
configured to determine that first return instructions are to be executed on the first 
samples, but not the second samples, to encode the second 
samples with non-operation data, to generate a PC token for executing the first return 
instructions, the PC token containing one or more codewords that configure 
programmable computation units within [[a]] the recirculating shader pipeline to execute 
the first return instructions, and to dispatch the PC token into the recirculating shader 
pipeline, followed by the first samples and the second samples. 

21. (Currently Amended): The system of claim 20, wherein the remap is further 
configured to decrement the first subroutine depth and to pop state data of the second 
samples from the global stack. 

22. (Currently Amended): A system for processing a group of related divergent 
graphics samples in a programmable graphics processing unit, the system comprising: 

configuring each of a plurality of programmable computation units by a field of 
codewords to perform an operation on multiple samples of the groups; 

means for incrementing a first subroutine depth of a first set of samples 
of the groups of samples to designate that first call instructions are to be executed on 
the first set of samples based on identifying the first set of samples having a 
greater subroutine depth than any sample of the second set of samples; 

means for maintaining a score board of subroutine depth for each sample of the 
groups of related samples; 

means for comparing the subroutine depth of every sample of the groups of 
samples prior to executing each call instruction on any of the samples; 

means for pushing state data of a second sample upon which the first call 
instructions are not to be executed onto a global stack; 

means for dispatching all the samples of the groups of samples through the 
pipeline with a token for configuring the pipeline after each comparison of the subroutine 
depths of each of the samples; 

means for executing the first call instructions on the first sample; 

means for executing first return instructions on the first sample; 

means for decrementing the first subroutine depth; and 

means for popping state data of the second sample from the global stack. 

23. (Original): The system of claim 22, further comprising means for holding the 
second sample idle. 

24. (Original): The system of claim 23, wherein means for holding comprises 
encoding the second sample with non-operation information. 

25. (Original): The system of claim 22, further comprising means for determining 
whether the first subroutine depth is greater than zero and means for executing second 
return instructions on the first sample and the second sample if the first subroutine 
depth is greater than zero. 
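
For illustration only, and not as part of the claims, the depth-tracking scheme recited above can be sketched in Python. Every name here (`Sample`, `enter_call`, `dispatch`, `leave_call`) is hypothetical, the global stack is modeled as a plain list, and the token is reduced to a single function applied to each active sample. The sketch assumes that idle samples simply pass through the pipeline unchanged; it does not model codewords or the recirculating hardware.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    value: int
    depth: int = 0      # hypothetical scoreboard entry: subroutine depth
    idle: bool = False  # True while the sample carries a no-op encoding


def enter_call(group, takes_call, stack):
    """Raise the depth of active samples that take the call; push and idle the rest."""
    for sample in group:
        if not sample.idle and takes_call(sample):
            sample.depth += 1
    deepest = max(s.depth for s in group)
    for index, sample in enumerate(group):
        if sample.depth < deepest and not sample.idle:
            stack.append((index, sample.value))  # save state data on the stack
            sample.idle = True


def dispatch(group, operation):
    """Send every sample through; only non-idle samples are operated on."""
    for sample in group:
        if not sample.idle:
            sample.value = operation(sample.value)


def leave_call(group, stack):
    """Lower the depth of the deepest active samples, then pop samples that match it."""
    deepest = max(s.depth for s in group if not s.idle)
    if deepest == 0:
        return  # nothing to return from
    for sample in group:
        if not sample.idle and sample.depth == deepest:
            sample.depth -= 1
    new_deepest = deepest - 1
    while stack and group[stack[-1][0]].depth == new_deepest:
        index, value = stack.pop()  # restore saved state data
        group[index].value = value
        group[index].idle = False
```

A caller would build a list of `Sample` objects, call `enter_call` with a predicate selecting the divergent samples, call `dispatch` once per token, and call `leave_call` at each return. The compare step is the `max` over subroutine depths, which decides which samples are operated on and which are held idle.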